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of  a  target  point,  and  the  instant  it  is  visited.  In  this  paper,  such  a  routing 
problem  is  considered  for  a  team  of  Uninhabited  Aerial  Vehicles  (UAVs), 
modeled  as  vehicles  moving  with  constant  forward  speed  along  paths  of 
bounded  curvature.  Three  algorithms  are  presented,  each  designed  for  a 
distinct  set  of  operating  conditions.  Each  is  proven  to  provide  a  system 
time  within  a  constant  factor  of  the  optimal  when  operating  under  the  ap¬ 
propriate  conditions.  It  is  shown  that  the  optimal  routing  policy  depends 
on  problem  parameters  such  as  the  workload  per  vehicle  and  the  vehicle 
density  in  the  environment.  Finally,  there  is  discussion  of  a  phase  transi¬ 
tion  between  two  of  the  policies  as  the  problem  parameters  are  varied.  In 
particular,  for  the  case  in  which  targets  appear  sporadically,  a  dimension¬ 
less  parameter  is  identified  which  completely  captures  this  phase  transition 
and  an  estimate  of  the  critical  value  of  the  parameter  is  provided. 
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area  of  the  bounded  domain,  m2 

area  per  vehicle  in  the  unbounded  domain,  m2 

nonholonomic  vehicle  density 

configuation-to-configuration  Dubins  distance  function,  m 
set  of  targets  outstanding  at  time  t 
configuration  (position  and  heading)  of  vehicle 
multi-median  function,  m 

minimum  value  of  the  multi-median  function,  m 
the  set  of  integers  between  1  and  m,  included 
configuration-to-position  Dubins  distance  function,  m 
number  of  UAVs 

steady-state  number  of  outstanding  targets 
position  of  the  Voronoi  generators,  m 
point  in  the  domain,  m 
a  bounded  planar  domain 

the  set  of  positions  reachable  from  configuration  g  within  time  t 

Special  Euclidean  group  in  two  dimensions  (i.e.,  the  group  of  rigid  planar  motions) 

time,  s 

the  steady-state  system  time,  s 
vehicle  speed,  ms-1 

width  and  height  of  a  box  bounding  the  domain  Q,  m 

target  generation  rate,  s-1 

a  routing  policy 

minimum  turning  radius,  m 

angular  velocity,  rad  s-1 

vehicle  index 
target  index 


I.  Introduction 

Wide-area  surveillance  is  one  of  the  prototypical  missions  for  Uninhabited  Aerial  Vehicles 
(UAVs),  e.g.,  in  environmental  monitoring,  security,  or  military  settings.  Low-altitude  UAVs 
on  such  a  mission  must  provide  coverage  of  a  region  and  investigate  events  of  interest  as  they 
manifest  themselves.  In  particular,  it  is  of  interest  to  consider  cases  in  which  close-range 
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information  is  required  on  targets  detected  by  high-altitude  aircraft,  spacecraft,  or  ground 
spotters,  and  the  UAVs  must  proceed  to  their  locations  to  gather  on-site  information. 

In  many  studies  of  problems  falling  into  this  class,  the  locations  of  targets  are  known  a 
priori,  and  a  strategy  is  computed  that  attempts  to  optimize  the  cost  of  servicing  the  known 
targets.  In  this  paper,  a  scenario  is  considered  in  which  the  target  points  are  generated 
dynamically,  with  only  prior  statistics  on  their  location,  and  a  policy  is  designed  to  minimize 
the  expected  time  a  target  waits  to  be  visited.  This  formulation  is  a  variation  of  the  Dynamic 
Traveling  Repairman  Problem  (DTRP) — introduced  by  Psaraftis1  and  thoroughly  developed 
by  Bertsimas  and  van  Ryzin2  4 —  with  the  addition  of  differential  constraints  on  the  vehicle’s 
motion.  In  particular,  this  paper  concentrates  on  vehicles  that  are  constrained  to  move  along 
paths  of  bounded  curvature,  without  reversing  direction.  Such  vehicles,  often  referred  to  as 
Dubins  vehicles,  have  been  extensively  studied  in  the  robotics  and  control  literature. 5-7 
Moreover,  the  Dubins  vehicle  model  is  widely  accepted  as  reasonably  accurate  to  represent 
aircraft  kinematics,  e.g.,  for  air  traffic  control8,9  and  UAV  mission  planning  purposes.10, 11 

The  DTRP  analysis  in  the  literature2-4  is  broken  into  two  limiting  cases:  i)  targets 
are  generated  sporadically  ( light  load),  and  ii)  targets  are  generated  rapidly  ( heavy  load). 
In  heavy  load,  the  optimal  policy  for  holonomic  vehicles  relies  upon  Euclidean  Traveling 
Salesman  tours  through  large  sets  of  target  points.  Policies  based  on  Euclidean  distance  are 
known  to  provide  very  poor  performance  for  UAVs  modeled  as  nonholonomic  vehicles  for  high 
target  densities;12  in  this  case,  we  use  recently-developed  algorithms  that  aim  at  minimizing 
the  travel  distance,  taking  into  account  the  turning  cost  between  consecutive  targets.13  In 
light  load,  rather  than  planning  tours  through  targets,  the  challenge  is  to  design  loitering 
patterns  or  policies  with  the  property  that,  when  a  target  does  appear,  the  expected  wait 
for  the  “closest”  UAV  to  arrive  is  minimized.  This  problem  is  inherently  more  complex 
for  nonholonomic  than  for  holonomic  vehicles:  for  holonomic  vehicles  the  light  load  DTRP 
reduces  to  a  choice  of  waiting  locations  and  solutions  are  known  from  the  well-developed 
locational  optimization  literature.14 

The  recent  literature  concerning  DTRP  and  aerial  surveillance  problems  is  vast.  This 
paper  is  along  the  vehicle  routing  themes  of  some  of  our  previous  work.13,15,16  Many  of 
the  new  results  for  the  light  load  were  introduced  in  Enright  et  al.17  and  are  applicable  to 
coverage  problems,18-21  in  which  the  vehicles  spread  out  uniformly,  or  comb  the  environment 
efficiently.  The  light  load  case  discussed  in  this  paper  has  connections  to  the  Persistent  Area 
Denial  (PAD)  and  area  coverage  problems.18,22-24  Moreover,  the  mathematical  structure  of 
the  light  load  case  has  a  strong  resemblance  with  continuous  facility  location  problems,14,25,26 
with  the  main  difference  in  our  case  being  that  the  facilities  are  vehicles  constantly  in  motion 
with  nontrivial  dynamics.  The  heavy  load  case  is  strongly  related  to  works  concerned  with 
the  generation  of  efficient  cooperative  strategies  for  several  vehicles  moving  through  given 
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target  points  and  possibly  avoiding  obstacles  or  threats.11, 27-30  Trajectory  efficiency  in  these 
cases  is  understood  in  terms  of  cost  for  the  vehicles:  in  other  words,  efficient  trajectories 
minimize  the  total  path  length,  the  time  needed  to  complete  the  task,  or  the  fuel/energy 
expenditure.  A  related  problem  has  been  investigated  as  the  Weapon-Target  Assignment 
(WTA)  problem,  in  which  vehicles  are  allowed  to  team  up  in  order  to  enhance  the  prob¬ 
ability  of  a  favorable  outcome  in  a  target  engagement.31,32  Other  related  works  address 
UAV  task-assignment:33-36  in  this  setup,  targets  locations  are  known  and  an  assignment 
strategy  is  sought  that  maximizes  a  global  performance  metric  such  as  success  rate.  Other 
works  addressing  cooperative  task  completion  by  UAVs  include.37,38  Some  other  works,  like 
this  paper,  address  coverage  and  sensing  tasks  with  only  prior  statistics  on  the  operating 
environment.39 

The  main  contributions  of  the  paper  are  the  following. 

1.  A  novel  approach  to  establish  a  lower  bound  on  the  system  time  for  Dubins  DTRP  is 
formulated.  The  approach  explores  properties  of  the  reachable  set  of  the  Dubins  vehicle 
and  may  be  of  independent  interest  as  it  can  be  applied  to  other  vehicles  performing 
similar  tasks. 

2.  Three  algorithms  have  been  proposed  and  it  is  proven  that,  for  certain  range  of  the 
problem  parameters,  they  perform  within  a  constant  factor  of  the  optimum,  i.e.,  the 
upper  bounds  on  their  performances  have  the  same  form  (in  terms  of  team  size  m, 
speed  v  and  target-generation  rate  A)  as  the  lower  bounds.  Specifically,  two  strategies 
for  the  light  load  are  designed.  One  strategy  assigns  regions  of  dominance  to  the 
vehicles  and  requires  the  vehicles  to  guard  their  own  territory,  thus  requiring  them  to 
spread  out.  This  strategy  is  optimal  when  vehicle  density  is  low.  The  other  strategy 
requires  the  vehicles  to  move  as  a  group  in  a  coordinated  pattern,  balancing  the  amount 
of  space  directly  in  front  of  each  vehicle.  This  strategy  is  within  a  constant  factor  of 
the  optimal  for  high  vehicle  densities.  In  the  heavy  load,  a  previously  proposed  single¬ 
vehicle  policy13  is  extended  to  the  multiple-vehicle  setting. 

3.  In  light  load,  the  optimal  system  time  is  shown  to  be  of  order  1  /{vf/m),  contrasting  the 
holonomic  case  which  yields  1  /(vy/m).  The  added  performance  per  vehicle  is  penalized 
by  the  nonholonomic  motion  constraints.  In  heavy  load,  the  system  time  is  shown  to 
be  of  order  A2/ (mu)3,  i.e.,  the  system  time  is  more  sensitive  to  the  workload  per  vehicle 
for  nonholonomic  vehicles  than  for  holonomic  vehicles  (where  the  systeme  time  is  of 
order  A/ (mu)2). 

4.  A  consequence  of  the  aforementioned  analytical  results  is  recognized:  an  optimal  rout¬ 
ing  policy  for  the  UAVs  exhibits  phase  transitions  between  these  three  algorithms  as 
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one  varies  the  problem  parameters.  For  the  light  load  case,  a  dimensionless  parameter 
that  completely  captures  the  phase  transition  is  identified.  An  estimate  of  the  critical 
value  of  the  parameter  that  separates  the  phases  is  also  provided. 

Other  researchers,  including  Vicsek  et  al.40  and,  more  recently,  Spears  et  al.,41  have 
investigated  phase  transitions  in  the  behavior  of  large  groups  of  vehicles.  In  these  works, 
the  individuals  follow  local  interaction  laws,  sometimes  inspired  by  physical  laws,  which  are 
shown  to  give  rise  to  different  modes  of  group  behavior,  depending  on  design  parameters 
and  possible  input  noise.  In  some  cases,  these  types  of  approaches  are  useful  for  accomplish¬ 
ing  large-scale  multi-vehicle  coverage  and  similar  tasks,  in  a  distributed,  robust  fashion.  In 
contrast,  a  top-down  approach  is  taken  in  this  paper.  Beginning  with  simple  motion  con¬ 
straints  (found  in  nature  and  technology)  for  the  individual  vehicles,  and  an  explicit  task 
and  performance  measurement,  the  aim  is  to  design  control  strategies  achieving  near  opti¬ 
mal  performance.  As  a  result  of  the  analysis,  the  optimal  policy  is  noticed  to  exhibit  more 
than  one  distinct  mode  of  behavior.  Rather  than  attempting  to  imitate  nature,  an  attempt 
is  made  in  this  paper  to  divulge  a  possible  cause  (efficiency)  behind  natural  multi-agent 
systems  which  exhibit  phase  transitions  between  territorial  and  gregarious  behavior.42 

The  rest  of  the  paper  is  organized  as  follows.  In  Section  II,  the  problem  is  set  up  and 
some  preliminary  concepts  and  notations  are  introduced.  In  Section  III,  the  case  in  which 
targets  appear  sporadically  is  addressed,  and  in  Section  IV,  the  case  in  which  targets  are 
generated  rapidly  is  investigated.  A  phase  transition  in  the  optimal  routing  policy  is  studied 
in  Section  V.  Finally,  some  concluding  remarks  are  made  and  directions  for  future  research 
are  discussed  in  Section  VI.  A  detailed  exposition  of  some  relevant  results  about  minimum- 
length  Dubins  paths  is  given  in  Appendix  A,  and  some  technical  lemmas  along  with  their 
proofs  are  given  in  Appendix  B. 

II.  Problem  Formulation  and  Preliminary  Concepts 

II. A.  Problem  Formulation 

In  this  section,  the  Dubins  DTRP  is  formulated  and  some  preliminary  concepts  for  minimum¬ 
time  paths  and  reachable  sets  for  the  Dubins  vehicle  are  presented.  Let  Q  C  R2  be  a  convex, 
compact  domain  on  the  plane,  with  non-empty  interior;  Q  is  assumed  to  be  at  the  ground 
level,  and  will  be  referred  to  as  the  environment.  Let  A  be  the  area  of  Q.  A  spatio-temporal 
Poisson  process,  with  finite  time  intensity  A  >  0  and  uniform  spatial  density  in  Q,  generates 
targets  inside  the  environment.  This  Poisson  process  can  be  thought  of  as  a  collection  of 
functions  {V  :  R+  — *  2s}  such  that,  for  any  t  >  0,  V(t)  is  a  random  collection  of  points  in 
Q,  representing  the  targets  generated  in  the  time  interval  [0,  t),  and  such  that 
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•  the  number  of  targets  generated  in  two  disjoint  time- space  regions  are  independent 
random  variables; 

•  the  expected  number  of  targets  generated  in  a  region  S  C  Q  during  a  time  interval  of 
length  At  is 

E[card  ((V(t  +  At)  \  V(t))  fl  «S)]  =  XAt  •  , 

The  information  about  the  locations  of  these  dynamically  generated  targets  is  provided 
by  surveillance  assets  (e.g.,  high-altitude  aircraft  or  spacecraft,  ground  spotters,  sensor  net¬ 
works,  etc.),  and  is  assumed  to  be  broadcast  to  the  UAVs  instantaneously.  Each  target 
corresponds  to  a  service  request.  These  service  requests  are  identical  and  are  to  be  fulfilled 
by  a  team  of  m  UAVs.  A  service  request  is  fulfilled  when  one  of  m  UAVs  flies  over  the  asso¬ 
ciated  target  point  (assumed  to  be  on  the  ground).  The  UAVs  operate  on  horizontal  planes, 
at  different  altitudes  to  avoid  collisions.  Also,  we  assume  that  all  these  horizontal  planes  are 
high  enough  to  clear  all  terrain  and  obstructions  on  the  ground.  We  do  not  model  no-fly 
zones,  and  leave  this  to  future  work.  Since  the  flying  altitude  does  not  play  a  significant  role 
in  the  course  of  the  paper,  the  motions  of  all  the  UAVs  are  projected  onto  the  plane  con¬ 
taining  Q  and  restrict  our  attention  to  this  plane.  Henceforth,  the  location  of  the  UAV  shall 
be  identified  with  its  projection  onto  the  ground  plane.  Finally,  we  assume  that  the  UAVs 
maintain  a  constant  speed;  this  assumption  is  consistent  with  standard  operational  practice, 
as  aircraft  typically  fly  most  of  their  mission  at  a  constant  cruising  speed,  calculated  in  such 
a  way  to  maximize  range  or  endurance.43 

The  motion  of  the  UAVs  will  be  modeled  by  a  kinematic  model  in  R2.  In  particular,  the 
UAVs  are  modeled  as  nonholonomic  vehicles  constrained  to  move  on  the  plane  at  constant 
speed,  v  >  0,  along  paths  with  a  minimum  radius  of  curvature  p  >  0.  In  other  words,  if 
the  configuration  gt  G  SE(2)  of  the  ?'-th  vehicle  (Vi  G  Im  where  Im  =  {1, . . . ,  m})  is  given 
in  coordinates  by  gt  =  (x,,  Of) — where  yL  are  the  projections  of  the  vehicle’s  position 
along  inertially  fixed  orthogonal  axes,  and  0t  is  the  orientation  of  the  vehicle’s  heading  with 
respect  to  the  a;- axis — then  the  model  of  the  vehicle  is  given  by: 

Xi  =  ucos(6|j), 

Vi  =  vsin(0i), 

0i  =  (x>i,  \ui\  <  v/p. 

This  model  is  often  referred  to  as  the  Dubins  vehicle,  in  recognition  of  Dubins’  work  in 
computing  minimum-length  paths  for  such  model.5  A  feasible  path  for  the  Dubins  vehicle 
is  defined  as  a  path  that  is  twice  differentiable  almost  everywhere,  and  such  that  its  radius 
of  curvature  is  bounded  below  by  p.  The  UAVs  are  assumed  to  be  identical,  and  have 
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unlimited  range.  The  above  kinematic  model  is  very  common  in  the  literature  on  UAV 
motion  planning.10,11  Results  in  terms  of  minimum- length  paths  for  Dubins  vehicle5  hold 
for  our  model,  where  they  assume  an  additional  connotation  of  being  minimum-time  paths 
for  vehicles  with  constant  speeds.  In  the  course  of  this  paper,  the  term  Dubins  frame  shall 
be  used  to  refer  to  a  coordinate  frame  with  the  origin  attached  to  the  Dubins  vehicle  and 
a;- axis  along  the  vehicle’s  velocity  vector. 

Let  V  :  t  — »  2  s  indicate  a  realization  of  the  stochastic  process  obtained  by  combining 
the  target-generation  process  V  and  the  removal  process  caused  by  the  vehicles  visiting  the 
outstanding  requests.  The  random  set  T>(t)  C  Q  represents  the  demand ,  i.e.,  the  service 
requests  outstanding  at  time  t;  n(t )  is  defined  to  be  the  cardinality,  i.e.,  the  number  of 
elements,  of  the  set  T>(t). 

A  motion  coordination  policy  is  a  function  that  determines  the  actions  of  each  vehicle 
over  time.  For  the  time  being,  denote  these  functions  will  be  denoted  as  7r  =  (7Ti,  7:2,  ■  ■  ■ ,  7rm), 
without  explicitly  stating  their  domain;  the  output  of  these  functions  is  a  steering  command 
for  each  vehicle.  The  objective  is  the  design  of  motion  coordination  strategies  that  allow  the 
vehicles  to  fulfill  service  requests  efficiently. 

A  policy  7T  =  (7Ti,  7t2,  . . . ,  7 Tm)  is  said  to  be  stabilizing  if,  under  its  effect,  the  expected 
number  of  outstanding  targets  does  not  diverge  over  time,  i.e.,  if 

nn  =  lim  E  [n(t)  :  vehicle  i  executes  policy  7T;,7  E  Im]  <  +oo. 

t^  +  OO 

Intuitively,  a  policy  is  stabilizing  if  the  vehicles  are  able  to  visit  targets  at  a  rate  that  is, 
on  average,  at  least  as  fast  as  the  rate  at  which  new  service  requests  are  generated.  Let  Tj 
be  the  time  elapsed  between  the  generation  of  the  j-th  service  request,  and  the  time  it  is 
fulfilled.  If  the  system  is  stable,  then  the  following  balance  equation  (also  known  as  Little’s 
formula44)  holds: 

nn  =  XTn,  (1) 

where  Tn  :=  lim/_+oc  E[T)]  is  the  system  time  under  policy  7 r,  i.e.,  the  expected  time  a 
service  request  must  wait  before  being  fulfilled,  given  that  the  vehicles  follow  the  strategy 
defined  by  n.  The  system  time  Tn  can  be  thought  of  as  a  measure  of  the  quality  of  service 
collectively  provided  by  the  vehicles. 

At  this  point  the  problem  can  be  finally  stated  as:  we  wish  to  devise  policies  that  are  (i) 
stabilizing,  and  (ii)  yield  a  quality  of  service  (i.e.,  system  time)  achieving,  or  approximating, 
the  theoretical  optimal  performance  given  by 

Topt  =  inf  Tn 

7 r  stabilizing 
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It  is  of  interest  to  design  computationally  efficient  control  policies  that  are  within  a  constant 
factor  of  the  optimal,  i.e. ,  policies  n  such  that  Tn  <  h~Topt  for  some  constant  k.  In  the  rest 
of  the  paper,  the  terms  policy  and  algorithm  are  used  interchangeably. 

Since  it  is  difficult  to  characterize  the  optimal  achievable  performance  for  general  values 
of  the  problem  parameters,  the  focus  is  on  particular  range  of  problem  parameters.  To  that 
purpose,  ^  is  defined  as  the  vehicle  workload.  This  parameter  characterizes  the  average  rate 
at  which  each  vehicle  must  service  requests  in  order  for  the  number  of  outstanding  targets 
to  remain  bounded.  It  is  found  that  the  asymptotic  cases  of  A/m  — »  0+  (light  load)  and 
A/ra  — >•  Too  (heavy  load)  lend  themselves  to  succinct  and  meaningful  analysis. 

This  section  is  concluded  by  describing  some  notation  providing  a  concise  way  of  ex¬ 
pressing  the  asymptotic  dependence  of  the  system  time  T  on  the  problem  parameters.  For 
/,  g  :  N  — >  M,  /  G  0(g)  (respectively,  /  G  Ll(g))  if  there  exist  N0  G  N  and  k  G  M+  such 
that  \f(N)\  <  k\g(N)\  for  all  N  >  N0  (respectively,  \f(N)\  >  k\g(N)\  for  all  N  >  N0).  If 
/  G  0(g)  and  /  G  Ll(g),  then  the  notation  /  G  ©(g)  is  used. 

II. B.  Minimum-length  paths  and  the  reachable  set  for  the  Dubins  vehicle 

Minimum-length  Dubins  paths  between  two  configurations  have  been  extensively  studied, 
due  to  their  importance  in  mobile  robotics.  A  full  characterization  of  optimal  paths  is  given 
in  Dubins;5  a  further  classification  is  given  in  Shkel  and  Lumelsky.45  Our  purpose  in  this 
section  is  to  study  the  length  of  optimal  paths  given  different  boundary  conditions.  Namely, 
the  problem  of  steering  a  Dubins  vehicle  from  a  given  initial  configuration  to  a  point  in  the 
plane  will  be  considered;  the  difference  with  the  original  problem  posed  by  Dubins5  is  that  the 
final  heading  at  the  target  point  is  not  constrained  a  priori.  Let  Lp(g,  q)  :  SE(2)  x  M2  — ■>  M+ 
be  the  length  of  shortest  feasible  path  for  a  Dubins  vehicle  to  steer  from  initial  configuration 
g  in  SE(2)  to  a  point  q  in  the  plane,  without  any  constraints  on  the  final  heading,  i.e.,  to  any 
configuration  in  the  set  \q}  x  S1  C  SE(2).  The  characteristics  of  paths  of  minimal  length 
with  such  boundary  conditions  were  studied  in  Thomaschewski,46  where  it  is  provem  that  all 
such  paths  are  a  concatenation  of  an  arc  of  a  minimum-radius  circle  (either  in  the  positive  or 
negative  direction) ,  with  either  an  arc  of  a  minimum-radius  circle  (in  the  opposite  direction) , 
or  with  a  straight  segment.  The  length  of  the  circular  arcs  is  at  most  2i rp,  and  all  subpaths 
are  allowed  to  have  zero  length.  In  the  Dubins  formalism,  such  paths  are  called  either  of  CC 
(circle,  circle)  or  of  CL  (circle,  line)  type,  and  include  the  subtypes  C  or  L.  arising  when 
the  length  of  either  one  of  the  two  subpaths  is  zero.  Clearly,  both  subpaths  have  zero  length 
in  the  trivial  case  in  which  q  is  the  initial  position  of  the  vehicle.  Furthermore,  the  optimal 
paths  are  of  type  CC  when  q  lies  within  the  interior  of  one  of  the  two  circles  of  radius  p 
that  are  tangent  to  the  vehicle’s  direction  of  motion  at  the  initial  configuration,  and  are  of 
type  CL  otherwise.  Closed-form  expressions  for  the  lengths  of  such  paths,  and  a  detailed 
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exposition  on  the  function  Lp ( g, ,  q),  are  given  in  Appendix  A.  For  now,  an  upper  bound  on 
Lp(gi,  q )  which  will  be  useful  in  analyzing  one  of  the  presented  policies  is  presented. 

Proposition  1  For  any  q  G  M2  and  g0  =  (0,0,0),  there  exist  constants  C\  >  0,c2  >  0  such 
that  the  function  Lp(go,  q)  satisfies  the  following  inequality. 

Lp(g0,q)  <dp+\\q- (0,±c2p)\\.  (2) 

In  particular,  the  assertion  is  true  for  c\  ~  9.39  and  c2  =  1. 

Proof:  Let  Dp(gi,g2 )  be  the  length  of  shortest  feasible  path  for  a  Dubins  vehicle  to 
steer  from  initial  configuration  g  in  SE(2)  to  a  final  configuration  g2  G  SE(2).  Since,  Lp(g0,  q) 
is  the  length  of  the  shortest  path  from  go  to  q,  one  can  verify  that  it  satisfies  the  following 
relation  for  any  qc  G  M2. 


Lp(go,q)  <  max  Dp(g0,  ( qc,6 ))  +  \\q  -  qc ||.  (3) 

|0,27t) 

The  following  bound  is  proven  in  Savla  et  al .:13 

Dp(g0,  (qc,  9))  <  ||gc||  +  8.39p  V6  G  [0, 27 r).  (4) 

The  result  follows  by  substituting  Equation  (4)  into  Equation  (3)  for  qc  =  (0,  ±p).  ■ 

Remark  2  An  implication  of  Proposition  1  is  that,  if  a  vehicle  is  moving  along  a  circle  of 
radius  c2p  about  a  point  gc;rc,  then  it  can  reach  any  point  q  G  M2  with  a  path  of  length  no 
greater  than  c\p  +  \\q  —  gcjrc||  at  any  time  during  its  circular  motion. 

Remark  3  Using  numerical  optimization,  it  is  found  that,  by  selecting  c2  ~  2.91,  one  could 
reduce  c\  to  approximately  3.76. 

Remark  4  An  upper  bound  on  Lp(g0,q )  under  the  simplifying  assumption  of  ||g||  >  2 p  is 
also  obtained  in  Rathinam  et  aid7 

The  reachable  set  for  a  Dubins  vehicle  and  a  property  that  will  be  useful  later  in  the 
paper  are  defined  next.  Given  an  initial  condition  gt ,  the  reachable  set  R<t(g,)  C  SE(2)  for 
a  Dubins  vehicle  is  the  set  of  final  configurations  gj  such  that  there  exist  admissible  controls 
to  drive  the  vehicle  from  the  initial  configuration  g%  to  the  final  configuration  gj  within  time 
t.  It  is  well  known  that  the  Dubins  vehicle  is  not  small-time  locally  controllable ,  i.e.,  there 
exist  t  >  0  for  which  the  reachable  set  R<t(gi )  does  not  contain  an  e-neighborhood  of  gt .  It 
is  of  interest  in  this  paper  to  focus  on  the  reachable  region  on  the  Euclidean  plane.  For  any 
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g  =  (p,  9)  G  SE(2),  define  the  projection  operator  P  :  SE(2)  — >  M2  as  P(g)  =  p.  Furthermore, 
for  any  R  C  SE(2)  let  P(R)  =  U geRP(g).  For  simplicity  of  notation  the  reachable  region  on 
the  plane,  P(R<t(gi)),  shall  be  henceforth  indicated  with  Rt(gt). 

The  minimum-length  path  to  any  point  within  the  interior  of  one  of  the  minimum  turning 
radius  circles  has  a  length  of  at  least  up,  and  so  the  shortest  path  to  any  point  reachable 
within  time  t  <  np/(2v)  is  of  type  CL.  In  Fig.  1,  Rt(gi )  for  some  time  t  <  np/(2v)  is 
depicted,  along  with  a  generic  minimum-length  path  of  length  vt.  Let  the  length  of  the  C 
segment  be  denoted  by  s  and  the  angle  it  sweeps  out  with  9.  Since  s  =  p6,  the  length  of 
the  L  segment  is  vt  —  s  =  vt  —  p6.  Also,  the  switching  point  between  the  C  and  L  segments 
is  denoted  with  ( x\,y\ ),  the  point  at  the  end  of  the  L  segment  with  (2,2 •  1)2) ■  Elementary 


\ 


\ 


,(*2,2/2) 


X 


N 


/ 


\ 


/ 


\ 


+ 


Figure  1.  The  derivation  of  Area,(Rt(gi))  f°r  t  <  7rp/(2v). 


planar  geometry  computations  yield 


X\  =  psind, 

2/i  =  p(l-cos0), 

£2  =  xi  +  (vt  —  p6)  cos  9, 

2/2  =  2/1  +  (vt  ~  p9)  sin  9. 


Moreover,  for  p  and  t  such  that  t  <  itp/ (2v) 


(sin 2  9(vt/p  —  9)2  +  sin 9(1  —  cos  9)(vt/p  —  9)  +  cos  0(1  —  cos0))  d9. 
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The  above  integral  can  be  computed  as 

Area (Rt(gi))  =  for  t  <  (5) 

Remark  5  Equation  (5)  shows  that  Area (Rt(gi))  approaches  zero  as  (vt)3/p,  i.e.,  the  area 
of  this  reachable  set  decreases  faster  than  the  area  of  a  circle  of  radius  vt  as  vt/p  ^  0+. 

Equation  (5)  also  implies  the  following  result. 

Proposition  6  For  a  given  a,  p  >  0  and  gi  e  SE(2),  let  r  be  such  that  Area(i2T(pi))  =  a. 
Then, 

t  =  —  ( 3pa )1^3  for  a/p2  <  7r3/24. 

III.  Light  Load 

In  this  section  the  case  in  which  new  targets  appear  very  rarely  is  addressed,  and  lower 
bounds  and  constructive  upper  bounds  on  the  system  time  are  provided. 

III. A.  Lower  Bounds 

A  review  of  the  known  results  for  the  Euclidean  case  is  provided  first,  which  are  then  used 
to  derive  lower  bounds. 

Euclidean  Lower  Bound 

A  trivial  lower  bound  on  the  system  time  T  can  be  obtained  by  adopting  the  corresponding 
lower  bound  for  a  Euclidean  vehicle ,  i.e.,  a  vehicle  moving  with  unit  speed  and  having 
no  motion  constraints.  In  order  to  do  that,  a  brief  overview  of  a  related  problem  from 
geometric  optimization  is  necessary.  Given  a  convex,  compact  set  Q  C  M2  and  a  set  of 
points  p  =  {pi,P2,  ■  ■  ■  j Pm}  £  Qm ■  the  expected  distance  between  a  random  point  q,  sampled 
from  a  uniform  distribution  over  Q,  and  the  closest  point  in  p  is  given  by 

If  1  f 

Timip,  Q)  ■=  T  /  min  I \Pi  -  q\\  d<l  =  -7  J2  /  I \Pi  ~  ?ll  d(T 
A  JQ  A  JVi(p) 

where  V(p)  =  (Vi  (p),  V2 (p), ,  Vm(p))  is  the  Voronoi  (Dirichlet)  partition 48  of  Q  generated 
by  the  points  in  p,  i.e., 

Vi(p)  =  {qe  Q:  II?  —  Pill  <  \\q-Pj\\,Vi,j  e  /mi¬ 
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The  problem  of  choosing  p  to  minimize  Hm(p,  Q )  is  known  in  the  geometric  optimization 
literature  as  the  continuous  supply  /  continuous  demand  multi-median  problem.49,50  Related 
problems  include  computing  m- medians  and  centers  for  discrete  supply  and/or  demand  sets 
and  references  therein  . 14,25,26,51,52  The  m- median  of  the  set  Q  is  the  global  minimizer 
p*(Q )  =  argminpGQm  Hm(p,  Q)-  Let  =  Hm(p*(Q),Q )  be  the  global  minimum  of 

'HmiPi  Q)  for  fixed  Q  and  m.  It  is  straightforward  to  show  that  the  map  p  >  Hi(p ,  Q) 
is  differentiable  and  strictly  convex  on  Q.  Therefore,  it  is  a  simple  computational  task  to 
compute  p\(Q).  However,  the  map  p  i— >  7irn (p.  Q )  with  m  >  1  is  differentiable  (whenever 
(pi , . . . .  pm)  are  distinct)  but  not  convex,  thus  making  the  solution  of  the  continuous  m- 
median  problem  hard  in  the  general  case.  In  fact,  it  is  known26  that  the  discrete  version 
of  the  m- median  problem  is  NP-hard  for  d  >  2.  However,  numerical  techniques  for  finding 
local  minima  of  continuous  m-median  problems  can  be  designed  using  i)  the  “honeycomb 
heuristic”49  discussed  below,  and  ii)  the  fact  that  pi  is  the  median  of  Vt(p)  for  each  p,  in 
p*(Q).  An  adaptive  sampling-based  algorithm  to  compute  p,  converging  to  local  minima 
or  saddle  points  of  ?im(p,  Q )  is  given  in.53  The  issue  of  computation  of  the  m- median  and 
of  the  corresponding  7i*n(Q)  will  not  be  pursued  further,  but  it  will  be  assumed  that  these 
values  are  available. 

In  course  of  the  paper,  the  dependence  of  H^Q)  on  m  will  play  a  crucial  role  in  the  design 
and  analysis  of  algorithms.  However,  finding  the  exact  relationship  for  the  general  case  is 
difficult.  Hence,  now  the  problem  of  providing  some  tight  bounds  on  7d^(Q)  is  considered. 
This  problem  is  studied  thoroughly  in  Papadimitriou50  for  square  regions  and  in  Zemel49 
for  more  general  compact  regions.  It  is  known  that,  in  the  asymptotic  case  (m  — >  +oo), 
=  c^yfAjm  almost  surely,49  where  Chex  ~  0.377  is  the  first  moment  of  a  hexagon 
of  unit  area  about  its  center.  This  optimal  asymptotic  value  is  achieved  by  placing  the  m 
points  at  the  centers  of  the  hexagons  in  a  regular  hexagonal  lattice  within  Q  (the  honeycomb 
heuristic).  Working  towards  the  above  result,  it  is  also  known49  that  for  any  m  E  N: 


2 

3 


<  n*JQ )  <  c(Q) 


where  c(Q )  is  a  constant  depending  on  the  shape  of  Q.  The  above  result  is  mirrored  in  the 
following  Proposition  for  several  reasons.  First,  the  proof  technique  that  will  be  used  for  the 
lower  bound  foreshadows  much  of  the  logical  structure  used  in  the  Dubins  lower  bound  of 
Theorem  10,  without  the  additional  complexity  due  to  the  motion  constraints  of  the  Dubins 
vehicle.  Second,  the  proof  of  the  upper  bound  is  omitted  from  the  published  work.49  The 
proof  for  a  specific  example  of  such  a  constant  c(Q)  is  provided  next. 


Proposition  7  For  any  convex,  compact  Q  C  M2  of  area  A,  the  function  H^Q)  belongs  to 
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0(1  /y/rn).  In  particular, 


2 

3 


<  Kn(Q)  < 


csqA/3L(Q) 

y/rn 


where  csq  ~  0.38  and  L(Q )  is  the  side-length  of  the  smallest  square  containing  Q. 


Proof:  The  proof  of  the  upper  bound  is  given  first.  If  the  environment  Q  is  a  square 
of  side- length  L,  it  is  known54  that  H*(Q)  =  csqL,  with  csq  ~  0.38.  If  the  environment 
is  a  square  and  m  is  a  perfect  square  (i.e.,  \fm  is  integer),  then  the  environment  can  be 
divided  into  m  equally  sized  squares,  each  with  one  of  the  m  points  at  its  center,  and  so 
H*mjQ)  —  csq L/y/m.  For  an  environment  of  general  shape  and  a  perfect  square  m, 


Kn jQ)  < 


CsqLjQ) 

y/rn 


where  L(Q)  is  the  side-length  of  the  smallest  square  containing  Q.  This  holds  because  the 
optimal  value  for  a  square  containing  Q  is  strictly  greater  than  the  optimal  value  for  Q  itself. 
For  general  Q  and  m  only  |_  y/rn\ 2  of  the  m  points  are  utilized  and  the  same  technique  as 
earlier  is  used  to  get: 


n*m(Q)  < 


CsqLjQ) 

iv™\ 


Finally,  noting  that  \y/rn\  >  \frrif  \f)  for  any  m  G  hi,  the  stated  upper  bound  is  obtained. 

Now  the  lower  bound  is  proved.  Let  Ai  =  Area(V,;(p))  be  the  area  of  the  Voronoi  region 
belonging  to  vehicle  i  and  let  C a,  {'pi)  be  a  disk  of  area  A%  centered  at  p, .  Then 


KnjQ)  =  min  [  ^  min  \\Pi  -  q\\dq 

peQrn  J  q  A  ielm 


=  mm 

peQ m 


[  -JPi-q\\dq 

i=i  JVitp)  A 

m  f  1 


i=l  JcaM ) 

m  «  ^  m 

*  A**' subj«ct  to  Y,A  =  A, 


where  the  fact  that  for  a  given  Pi,  the  region  of  area  At  with  the  least  first  moment  about 
Pi  is  a  disk  of  that  area  is  used.  Note  that  the  radius  of  such  a  disk  is  r(Ai)  =  \/~AfJf.  It 
is  known  that  the  expected  distance  between  a  uniformly  distributed  point  within  a  disk  of 
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radius  r  and  its  center  is  2r/3.  Thus, 


min  V]  4 ilr(A), 

2=1 


mm 


?>A^K  {A1,A2,...,Am}eRrn  ^ 
V  2=1 


EV2, 


m 

subject  to  Ai  =  A, 

2=1 

m 

subject  to  Ai  =  A. 

2=1 


The  above  quantity  is  minimized  when  A,  =  A/m,  for  all  i,  yielding  the  stated  result.  Note 
that  the  above  lower  bound  holds  for  any  region  and  any  number  of  vehicles.  ■ 

The  following  result  was  first  proved  in  Bertsimas  and  Van  Ryzin2  for  holonomic  vehi¬ 
cles,  and  as  such  is  trivially  valid  for  nonholonomic  vehicles.  A  short  proof  is  reported  for 
completeness. 

Theorem  8  For  any  convex,  compact  Q  C  M2  of  area  A  and  A,  p,  v  €  M+  and  m  E  N,  the 
system  time  for  the  Dubins  DTRP  satisfies 

Topt  >  (6) 

Proof:  Consider  vehicle  i  with  configuration  <j,  and  position  pt.  Trivially,  Lfic/j,  q)  > 
\\pi  —  q\\  for  any  q  in  R2.  Thus, 


T0pt  A 


inf 

geSE(2)m 


r  i  i 

/  —  min -LJqi,  q)  dq  >  inf 
lQAi£imV  Pe( R2y 


=  mm 

peQm 


min  — 1|  pi 


A  ieir, 


v 


tfll  dq 


11  H*  (Q) 

mm-\\pz-q\\  dq  =  ^n^±. 


A  i£ln 


V 


V 


Remark  9  Proposition  7  and  Theorem  8  imply  that,  for  a  given  A  and  p,  Topt  belongs  to 


Dubins  Lower  Bound 

In  the  previous  section,  a  lower  bound  is  derived  by  approximating  the  Dubins  distance  with 
the  Euclidean  distance.  It  is  natural  to  expect  that  such  a  lower  bound  is  conservative.  In 
this  section,  a  new  lower  bound  is  derived  by  explicitly  considering  the  turning  cost  associated 
with  the  motion  of  a  Dubins  vehicle.  Consider  the  simple  scenario  of  a  single  Dubins  vehicle 
responsible  for  the  environment  Q,  in  the  case  where  targets  are  generated  very  infrequently. 
Since  the  vehicle  cannot  stop,  a  control  policy  must  include  a  loitering  path  while  waiting 
for  targets  to  appear.  This  may  be  a  predetermined  pattern,  or  possibly  a  randomized  or 
adaptive  path.  Towards  a  lower  bound,  consider  a  Dubins  vehicle  which  is  capable  of  waiting 
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in  a  static  configuration  (alternatively,  consider  that  for  any  loitering  path,  there  exists  a 
moment  in  the  pattern  for  which  the  expected  Dubins  distance  to  a  target  in  Q  is  minimal). 
If  the  UAVs  were  capable  of  stopping,  this  problem  could  be  reduced  to  finding  the  optimal 
configuration  of  the  m-vehicle  system.  Instead,  it  is  required  to  design  control  policies  that 
keep  the  configuration  near  optimal  in  a  time-averaged  sense. 

Considering  a  single-vehicle  system,  towards  a  lower  bound,  one  gets  that 

Topt  -  Jslm  JQ  M,Lp^9i' q)dq' 

Let  tp(A)  =  {f|  Area (Rt(gi))  =  A}.  The  function  tp  is  a  single-valued  function.  This  is 
because  the  area  of  a  reachable  set  Rt(gf)  is  a  strictly  increasing  function  of  t.  For  clarity,  in 
the  following  theorems,  let  us  specify  a  reachable  region  by  its  area,  i.e.,  RA^gf)  =  Rtp{A)(9i)- 
Now,  towards  a  lower  bound,  allow  the  environment  to  take  on  any  shape,  so  long  as  its  area 
remains  unchanged.  For  a  given  configuration,  gi:  the  region  S  of  area  A  with  least  expected 
Dubins  distance  frorng*,  i.e.,  fs  j^Lp(gi,  q )  dq,  is  the  reachable  region  of  area  A,  i.e.,  Ra (gf). 

Given  a  set  Q  C  M2  and  a  set  of  Dubins  configurations  g  =  {gi,g2,  ■  ■  ■  ,gm}  £  SE(2)TO, 
VV(g)  =  {DV1(g),..., 

T>Vm(g)}  is  defined  as  the  Dubins  Voronoi  partition  of  the  set  Q  generated  by  the  configu¬ 
rations  g.  In  other  words,  q  G  T>Vi(g)  if  Lp(gi,q)  <  Lp(gj,q),  for  all  i,j  G  Im.  Note  that  a 
vehicle’s  Dubins  Voronoi  region  is  exactly  its  instantaneous  region  of  dominance,  i.e.,  the  set 
of  points  which  can  be  reached  by  the  vehicle  faster  than  any  other.  A  new  lower  bound  on 
the  Dubins  DTRP  can  be  obtained  based  on  the  idea  of  finding  the  optimal  configuration 
for  the  m-vehicle  system,  as  if  they  were  capable  of  stopping: 

Theorem  10  For  any  convex,  compact  QcM2  of  area  A  and  A,  p,  v  G  M+  and  m  G  N,  the 
system  time  for  the  Dubins  DTRP  satisfies 

_  jyi  r 

Topt  ^  ~a  /  dq.  (7) 

J Ra  (di) 

m 

Proof:  In  the  following,  the  notation  Ai  =  Area (VVi(g))  is  used.  Beginning  with  the 
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definition  of  Topt,  one  has  that 


—  f  1  1 

Topt  >  inf  /  —  m.m-Lp(gi)q)dq 
ge SE(2)m  Jq  A  ielm  V 

m  r  I 

=  in*  y]  /  -rLp(gi,q)  dq 

g€SE(2)m  ^  yw.(s)  An 


m  /,  1 

m  ^  m 

—/(A,),  subject  to  A;  =  A  and  Aj  >  0  Vi  G  {1, 

2  Z 

where  /  :  M+  — ■>  M+  is  defined  by 


'  RAi  ( 9i ) 
m 

>  min 

{ni,A2,...,Am}eRm  '  Av ‘ 


,m}, 


/(A)  =  f  Lp(gi,q)  dq.  (8) 

J  R-Aigi) 

It  is  easy  to  verify  that  the  function  /  is  a  continuous,  strictly  increasing  function  of  A. 
Lemma  31  in  Appendix  B  shows  that  /  is  convex.  Thus,  by  using  the  Karush-Kuhn-Tucker 
conditions,55  one  can  show  that  the  quantity 

^  m  m 

min  >  f  (AA  subject  to  >  Aj  =  A 

{A1)A2)..,Am}6R»*  Av  4^  V  '  4^ 

z  z 

is  minimized  with  an  equitable  partition,  i.e.,  A*  =  A/m,  Vi.  This  yields  the  stated  result. 


Note  that  the  above  lower  bound  holds  for  any  region  and  any  number  of  vehicles.  Even 
though,  in  this  integral  form,  the  dependency  of  the  lower  bound  on  parameters  such  as  m 
and  p  is  not  transparent,  it  is  shown  that  a  lower  bound  can  be  obtained  by  dividing  the 
environment  into  equally  sized  domains  of  responsibility,  i.e.,  by  balancing  the  workload  in 
terms  of  coverage. 

Let  the  nonholonomic  vehicle  density  be  a  non-dimensional  parameter  defined  as 


dp 


p2m 

~A~ 


(9) 


The  motivation  of  this  parameter  is  shown  in  the  proof  of  the  following  lower  bound,  which 
holds  for  any  policy,  and  is  tightest  for  small  vehicle  workloads. 


Theorem  11  For  any  convex,  compact  Q  C  K2  of  area  A,  A ,p,v  G  1R+,  and  m  G  N,  the 
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system  time  for  the  Dubins  DTRP  satisfies 


1/3  3^3 

Proof:  Proposition  6  implies  that,  for  dp  >  24/-7T3, 


lim  inf  Topt  f  >  , 

dp^+OO  J  4:V 


tp(A/m)  =  -  f — 'j 
v  \  m  J 


1/3 


(10) 


The  Dubins  distance  to  any  point  q  in  the  reachable  set  Ra  (gf),  expressed  as  (x,y)  in  the 

m 

Dubins  frame,  can  be  bounded  by  Lp(gi:q)  >  x.  Let  (xi(9),y\(9))  =  (psin0,  p(l  —  cos 9)) 
represent  parametrized  points  on  the  minimum  turning  circle,  as  depicted  in  Figure  1.  Let 
RA^gfi)  :=  <  q  G  R^gf)  |  x  <  x\  (vtp(A/m)/p)  >,  i.e.,  RA^gfi)  is  the  set  of  points  in  Ra (gf) 
which  he  to  the  left  of  the  vertical  line  x  =  xi(vtp(A/m)/p).  Then, 


/  Lp(gi)q)dq>  /  xdq  >  /  xdq. 

jRA(9i)  dRA(gi)  d  Ra_  (di) 

m  mm 

The  last  integral  in  the  equation  above  can  be  evaluated  to  get  the  following  bound: 


f  rvtp(A/m)/p  j  (n\ 

/  Lp(gi,q)dq>  2  xx (9)y1(9) — r^-dO 

’RAiQi)  do 


dO 


fvtp(A/m)/p 

=  2p3  /  sin  6*  cos  0(1  —  cos  9)d9 

do 

rvtP{A/m)/p  v4t4(A/m ) 

=  P 3 1  ( e 3  +  o{93))d9  =  pA^L  l  +  p3 o(v4tp(A/m) / p4). 


(11) 

Note  that,  as  dp  — ►  +oo,  vtp(A/m) / p  goes  to  zero  from  Equation  (5).  Therefore,  applying 
Equation  (fO)  in  Equation  (11),  one  can  deduce  that,  as  dp  — >  +oo, 


lRA(9i)  4 m  \  m  J 


1/3 


Substituting  into  Equation  (7)  yields  the  result. 


Remark  12  Theorem  11  shows  that  Topt  belongs  to  fl  ^1  dn  particular,  for  fixed 

p  and  A,  Topt  belongs  to  D  * 

The  above  result  is  simply  the  integral  in  Theorem  10  carried  out  for  a  particular  asymp- 
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totic  region  of  the  problem  parameter  space.  It  is  reasonable  to  assume  that  a  policy  ap¬ 
proximating  this  lower  bound  would  somehow  allow  the  regions  of  dominance  to  remain 
balanced  throughout  time.  In  a  low  density  scenario,  this  requires  the  vehicles  to  simply 
spread  out.  However,  as  the  density  increases,  a  balanced  coverage  requires  that  a  vehicle’s 
region  of  dominance  be  very  small,  i.e.,  the  area  immediately  in  front  of  it.  Thus  the  optimal 
policy  in  this  phase  involves  dynamic  regions  of  dominance.  This  behavior  is  exhibited  by 
the  algorithms  proposed  in  the  next  section. 

III.B.  Policies  and  Upper  Bounds 

In  this  section,  control  policies  designed  for  the  light  load  are  proposed  and  then  analyzed  for 
their  performance.  The  first  policy — called  the  Median  Circling  policy — essentially  attempts 
to  imitate  the  optimal  policy  for  Euclidean  vehicles,  assigning  static  regions  of  responsibility. 
The  algorithm  is  formally  described  as  follows. 

The  Median  Circling  (MC)  Policy 

Each  vehicle  is  associated  with  a  generator  pt ,  i  G  Im.  Let  p*  be  the  m- median  of  Q.  and 
define  the  loitering  station  for  the  i-th  vehicle  as  a  circular  trajectory  of  radius  fyp  centered 
at  p*.  Each  vehicle  visits  all  targets  in  the  Voronoi  region  of  its  own  generator  V,  (p* ) ,  in  the 
order  in  which  they  arrive.  When  no  targets  are  available,  the  vehicle  returns  to  its  loitering 
station;  the  direction  in  which  the  orbit  is  followed  is  not  important,  and  can  be  chosen  in 
such  a  way  that  the  station  is  reached  in  minimum  time.  A  depiction  of  the  MC  policy  is 
shown  in  Fig.  2. 

Note  that  there  may  be  vehicles  closer  to  a  given  target  in  terms  of  Euclidean  distance 
or  Dubins  minimum-length  path.  However,  it  is  found  that  the  target-assignment  strategy 
described  above  lends  itself  to  tractable  analysis. 

Theorem  13  For  any  convex,  compact  Q  C  R2  of  area  A  and  p,  v  G  R+,  the  Median 
Circling  policy  is  a  stabilizing  policy  in  the  light  load,  i.e.,  as  A/m  — *  0+.  Moreover,  the 
system  time  of  the  Median  Circling  policy  satisfies  the  following,  as  A/m  — >  0+. 

7p  ^  'H'm(Q)  +  ci P 
J-mc  S - 

V 

Proof:  For  A/m  — >  0+,  after  an  initial  transient,  all  targets  will  be  generated  with  the 
vehicles  in  the  assigned  loitering  station,  and  an  empty  target  queue  with  probability  one. 
Therefore,  the  waiting  time  for  a  new  target  after  that  is  finite.  Hence,  the  Median  Circling 
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Figure  2.  Depiction  of  the  Median  Circling  policy.  The  yellow  squares  represent  p* ,  the  m- median  of  Q.  Each 
UAV  loiters  about  its  respective  generator  at  a  radius  C2/0.  The  regions  of  dominance  are  demarcated  according 
to  the  Voronoi  partition  generated  by  p* .  In  this  depiction,  a  target  has  appeared  in  the  subregion  roughly  in 
the  upper-right  quarter  of  the  domain.  The  UAV  responsible  for  this  subregion  has  left  its  loitering  station 
and  is  en  route  to  service  the  target. 


policy  is  stabilizing.  Moreover,  the  expected  waiting  time  for  a  new  target  is  given  by 


Tmc  =  [  4  min  -Lp(ghq)  dq 

Jq  A  16  Im  V 

r  i  i 

<  /  —  min  —(||p*  —  g||  +  Cip)  dq 


i^ilm  V 


E 


i= 1  JVi(p*) 
^■m(Q)  +  ClP 


Remark  14  In  other  words,  it  is  shown  that  the  system  time  achieved  by  the  MC  policy 
is  within  a  constant  additive  factor  c\p/v  ~  9.39 p/v  from  the  optimal.  The  additive  fac¬ 
tor,  which  can  be  considered  a  penalty  due  to  the  nonholonomic  constraints  imposed  on  the 
vehicle’s  dynamics,  depends  linearly  on  the  minimum  turn  radius  p. 

Note  that  the  upper  bound  stated  above  does  not  tend  to  zero  as  the  size  of  the  team 
grows  large.  A  comparison  is  now  made  between  the  performance  of  the  Median  Circling 
Policy  and  the  optimal  limit  shown  in  Theorem  8. 
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Theorem  15  The  system  time  of  the  Median  Circling  policy  in  light  load  satisfies 


In  particular, 


limsup  _ MC  <  1  +  25 y/dp. 

—  ^0+  -^opt 


lim  lim  sup  _MC  =  1 . 

dP^  0+  A^0+  Topt 


Proof:  Since  i\I <  Ti*n(Q)  from  Proposition  7, 


ciP<^f=H'JQ). 

2  A 


3  V  7r  m 


The  first  statement  in  the  theorem  then  follows  by  substituting  this  into  Equation  (12)  and 
applying  Theorem  8.  The  second  result  follows  by  taking  the  limit  as  dp  —>  0+  in  the  first 
statement.  ■ 


Remark  16  Theorem  15  implies  that  the  Median  Circling  Policy  is  an  efficient  policy  for 
scenarios  where  the  nonholonomic  vehicle  density  is  low.  In  low  density  scenarios,  Euclidean 
distance  dominates  the  cost,  and  thus,  a  policy  imitating  that  of  Euclidean  vehicles  is  near 
optimal. 


Simulation  Results  for  the  MC  policy 

The  values  for  speed  and  turning-radius  used  here  and  throughout  the  paper  in  numerical 
experiment  are  typical  of  current  operational  UAVs.56  The  speed  and  turning-radius  used 
are  v  =  50  ms-1  and  p  =  600  m,  respectively.  In  some  sets  of  data,  the  turning-radius  is 
varied,  in  which  case  its  range  of  values  remains  on  the  same  order  of  magnitude. 

Simulations  of  the  MC  policy  were  run  for  various  values  of  p  and  m.  For  each  combi¬ 
nation  of  conditions,  the  system  time  shown  in  Fig.  3  is  the  average  of  the  waiting  times 
of  one-hundred  thousand  target  points  uniformly  generated  in  the  environment.  These  nu¬ 
merical  results  confirm  theoretical  predictions  that  the  system  time  shrinks  with  1  / \fm  and 
grows  with  p:  the  log-log  plots  of  system  time  versus  number  of  vehicles  (left)  have  slopes 
ranging  from  —0.48  to  —0.52,  implying  a  power  law  of  approximately  —1/2,  and  the  linear- 
linear  plots  of  system  time  versus  minimum  turning  radius  (right)  have  slopes  ranging  from 
0.96  to  1.04,  implying  a  linear  dependence  on  p  with  a  coefficient  of  approximately  unity.  As 
the  minimum  turning  radius  becomes  very  small,  the  performance  of  the  MC  policy  approx¬ 
imates  the  lower  bound  valid  for  a  vehicle  without  kinematics  constraints,  i.e.,  as  p  — ■>  0+, 
Tmc  — ►  TC*(Q)/v. 
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NUMBER  OF  VEHICLES  MINIMUM  TURNING  RADIUS  (m) 


Figure  3.  Simulations  of  the  MC  policy  were  run  with  a  square  environment  of  area  A  =  9  X  1010  m2  and  vehicle 
speed  v  =  50  ms-1.  The  performance  of  the  MC  policy  is  plotted  as  a  function  of  i)  the  number  of  vehicles 
(left),  and  ii)  the  minimum  turning  radius  of  the  vehicles  (right).  The  nonholonomic  vehicle  densities  for  the 
scenarios  of  all  data  points  shown  lie  within  the  range  0  <  dp  <  0.0056. 


The  next  strategy  aims  to  (i)  maintain  balanced  coverage  despite  the  requirement  of 
dynamic  regions  of  dominance  due  to  high  vehicle  density,  and  (ii)  yield  a  system  time  that 
provably  tends  to  zero  as  the  size  of  the  team  grows  large. 


The  Strip  Loitering  (SL)  Policy 

The  Strip  Loitering  policy  is  based  on  the  following  idea.  Bound  the  environment  Q  with 
a  rectangle  of  minimum  height,  where  height  denotes  the  smaller  of  the  two  side  lengths  of 
a  rectangle.  Let  W  and  H  be  the  width  and  height  of  this  bounding  rectangle,  respectively. 
Divide  Q  into  strips  of  width  w,  where 


<*> 

Orient  the  strips  along  the  side  of  length  W.  Construct  a  closed  Dubins  path  which  runs  along 
the  longitudinal  bisector  of  each  strip,  visiting  all  strips  in  top-to-bottom  sequence,  making 
U-turns  between  strips  at  the  edges  of  Q,  and  finally  returning  to  the  initial  configuration. 
Note  that  the  path  must  cover  Q,  but  it  need  not  cover  the  entire  bounding  box  of  Q.  The 
bounding  box  is  merely  a  construction  used  to  place  an  upper  bound  on  the  total  path  length. 
The  m  UAVs  loiter  on  this  path,  equally  spaced,  in  terms  of  path  length.  A  depiction  of 
the  Strip  Loitering  policy  is  shown  in  Figure  4.  Two  distances  that  are  important  in  the 
analysis  of  this  policy  are  shown  in  Figure  5.  At  the  instant  a  target  arrives,  we  construct  a 
circle  of  radius  p  which  is  tangent  to  the  loitering  path  and  intersects  the  target.  Variable 
g?2  is  the  length  of  the  arc  departing  from  the  loitering  path  and  ending  at  the  target.  The 
UAV  responsible  for  visiting  the  target  is  the  one  closest  in  terms  of  loitering  path  length 
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(variable  d±)  to  the  point  of  departure,  at  the  time  of  target-arrival.  Note  that  there  may  be 
UAVs  closer  to  the  target  in  terms  of  Euclidean  distance  or  Dubins  minimum-length  path. 
However,  the  target-assignment  strategy  described  above  lends  itself  to  tractable  analysis. 

After  a  UAV  has  serviced  a  target,  it  must  return  to  its  place  in  the  loitering  pattern. 
The  following  method  to  accomplish  this  task  is  described  using  the  example  shown  in  Fig. 
5.  After  making  a  left  turn  of  length  d2  to  service  the  target,  the  UAV  makes  a  right  turn 
of  length  2 d2  followed  by  another  left  turn  of  length  d2,  returning  it  to  the  loitering  path. 
However,  the  UAV  has  fallen  behind  in  the  loitering  pattern  by  a  distance  4(d2  —  psin  y).  To 
rectify  this,  as  it  nears  the  end  of  the  current  strip,  it  takes  its  U-turn  a  distance  2 (d2—p  sin  y ) 
early. 


Figure  4.  Depiction  of  the  Strip  Loitering  policy.  The  segment  providing  closure  of  the  loitering  path  (returning 
the  UAVs  from  the  end  of  the  last  strip  to  the  beginning  of  the  first  strip)  is  not  shown  here  for  clarity  of  the 
drawing. 


departure 


Figure  5.  Close-up  of  the  Strip  Loitering  policy  with  construction  of  the  point  of  departure  and  the  distances 
d,  di,  and  cfe  for  a  given  target,  at  the  instant  of  appearance. 
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Theorem  17  For  any  convex,  compact  Q  C  M2  of  area  A  and  p,  v  G  R+,  the  Strip  Loitering 
policy  is  a  stabilizing  policy  in  the  light  load  case,  i.e.,  when  A/m  — >  0+.  Moreover,  if  Q  has 
a  bounding  rectangle  of  width  W  and  height  H ,  the  system  time  of  the  Strip  Loitering  Policy 
satisfies  the  following,  as  A/m  — »  0+. 


^SL  < 


1.238  f  pWH+W^8p_H\  ,  W±H±6A9p 
v  \  m  J  mv 

V^if+10.38pif  ,  ]V±H±6A9p  ,  1.0  6p 
Apmv  mv  v 


for  m  >  0.471  {^Jr  +  10  38if^ , 
otherwise  . 


Furthermore,  for  any  A,  p,  v  G  R+, 


limsupTsL^1^3  <  (pWH  +  10.38 p2H")1//3. 

m^+oo  ^ 


(14) 


(15) 


Proof:  For  A/m  — >  0+,  after  an  initial  transient,  all  targets  are  generated  with  i)  the 
vehicles  at  their  designated  position  on  the  loitering  pattern,  and  ii)  an  empty  target  queue, 
with  probability  one.  Therefore,  the  waiting  time  for  a  new  target  after  that  is  finite.  Hence, 
the  Strip  Loitering  policy  is  stabilizing  in  light  load.  In  the  following,  an  expression  for  the 
system  time  is  derived.  Denote  the  length  of  the  closed  path  as  L\.  Due  to  the  equal  spacing 
of  the  UAVs  along  the  loitering  path, 


E  K] 


Ln_ 

2m 


(16) 


Let  A^tripS  be  the  number  of  strips,  Z/Strip  be  the  length  traveled  along  a  single  strip,  Lu_turn 
be  the  length  of  a  u-turn  and  Lc iosure  be  the  length  of  the  closure  path  segment.  With  these 
notations,  Li  can  be  bounded  as 


L 1  T  A/gtripsTstrip  “1“  ( A/f.rips  l)Tu_turn  +  Z/c]osure. 


(17) 


It  is  easy  to  verify  that  Astrips  =  \^}  <  ^  +  1  and  Lstrip  <  IF  +  2p.  where  the  2 p  term  arises 
from  the  fact  that  the  vehicles  must  start  on  a  strip  outside  Q  in  order  to  visit  a  target  that 
might  appear  on  the  boundary  of  Q  in  front  of  it.  Bounds  on  the  length  of  the  optimal  point- 
to-point  Dubins  path  from13  give  us  that  Lu_turn  <  w  +  nirp  and  Lciosure  <  W  +  H  +  2p  +  mrp, 
where  k  &  8/3.  Substituting  these  bounds  into  Eq.  (17)  and  taking  into  account  Eq.  (16), 


E[di]  < 


v  v  ±  ±  | 


2  mw 


+ 


m 


(18) 


To  calculate  E  [d2\ ,  define  5  as  the  smallest  distance  from  the  target  to  any  point  on  the 
loitering  path  (see  Fig.  5).  Since  d2(s)  =  2psin_1(^/^)  for  s  <  p  and  A  is  uniformly 
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distributed  between  0  and  w/2, 


rw  /  2 


E  [d2]  =  —  I  sin"1 
w  Jo 


o 

2  p  )  W  L 


\fx  —  x2  +  (— 1  +  2/c)  sin  1(v/^)  ,  (19) 

where  a;  =  w  /  (4  p).  It  can  be  shown  that,  for  all  x  G  [0,1/2], 


r~ -  X 

V 1  —  x  <  1 - and 

“2 


sin 


-l 


(Vx)  >  \[x  + 


Substituting  these  bounds  into  Eq.  (19)  and  through  further  algebraic  simplification,  for 
w  <  2p , 

E[d2]<^^pw.  (20) 

As  noted  earlier,  for  A/m  — »  0+,  after  an  initial  transient,  all  targets  are  generated  with  the 
vehicles  on  the  closed  loitering  path,  and  an  empty  target  queue.  The  system  time  satisfies 


^  „E[<q  +  E[<y 

J-  SL  S  - 

V 


(21) 


Therefore,  from  Equations  (21),  (18)  and  (20),  for  w  <  2 p,  in  the  limit  as  A/m  — »  0+, 


^sl  < 


WH  +  10.38pH  W  +  H  +  6.19p 
2  mvw  mv 


(22) 


In  order  to  get  the  least  upper  bound,  the  right  hand  side  of  Eq.  (22)  is  minimized  with 
respect  to  w  subject  to  the  constraint  that  w  <  2 p.  The  value  of  w  at  which  Tsl  attains 
its  minimum  value  is  given  by  Eq.  (13)  and  the  minimum  value  is  given  by  Eq.  (14).  For  a 
given  A  G  M+,  as  m  — >  +oo,  A/m  — >  0+.  The  result  in  Eq.  (15)  then  follows  by  taking  the 
limit  as  m  — >  +oo  in  Eq.  (14). 


Remark  18  Theorem  11  and  Theorem  17  imply  that,  in  the  light  load  case  and  for  a  fixed 
Q  and  p,  Topt  belongs  to  ©(1  /(vf/m)). 

Remark  19  In  the  light  load  case,  when  the  turning  radius  p  is  large  compared  to  the  di¬ 
mensions  of  the  region  Q,  Eq.  (14)  implies  that  Topt  belongs  to  0{1/ {mv)). 

Simulation  Results  for  the  SL  policy 

Simulations  of  the  SL  policy  were  run  for  various  values  of  m.  For  each  condition,  the  system 
time  shown  in  Fig.  6  is  the  average  of  the  waiting  times  of  one-hundred  thousand  target 
points  uniformly  generated  in  the  environment.  These  numerical  results  confirm  that  the 
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system  time  shrinks  with  1/m1/3:  the  log- log  plot  has  a  slope  of  —0.34,  implying  a  power 
law  of  approximately  —1/3. 


Figure  6.  Simulations  of  the  SL  policy  were  run  with  a  square  environment  of  area  A  =  9  X  106  m2,  vehicle 
speed  v  =  50  ms-1,  minimum-turning  radius  p  =  600  m,  and  m  varying  from  50  to  450.  The  nonholonomic 
vehicle  densities  for  the  scenarios  of  all  data  points  shown  lie  within  the  range  2  <  dp  <  18. 


IV.  Heavy  Load 

In  this  section,  performance  bounds  are  studied  and  algorithms  are  presented  for  the  case 
in  which  the  target  generation  rate  is  high. 


IV. A.  Lower  Bound 


Leveraging  the  results  on  the  small-time  reachable  set  derived  in  previous  sections,  a  lower 
bound  on  the  system  time  for  any  policy  in  the  heavy  load  case  can  be  stated.  This  lower 
bound  explicitly  takes  the  nonholonomic  constraint  into  account. 

Theorem  20  For  any  convex,  compact  Q  C  K2  of  area  A  and  p,  v  £  R+,  the  system  time 
for  the  Dubins  DTRP  satisfies 


lim  inf  Topt 

A _ 


m3  81  pA 
A2  —  64  v3 


Proof:  Under  the  actions  of  any  stabilizing  policy,  the  number  of  outstanding  targets 
approaches  a  finite  steady-state  value,  n,  related  to  the  system  time  by  Little’s  Formula 
(1).  Consider  the  m- vehicle  system  in  arbitrary  configuration.  In  the  following,  a  bound  is 
computed  on  the  expected  time  required  for  a  given  vehicle  i  to  visit  the  closest  target,  in  the 
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Dubins  distance  sense.  Call  this  target  the  nearest-neighbor,  and  call  such  a  time  tnn.  If  the 
vehicle  is  given  a  time  t  to  move,  the  probability  that  tnn  >  t  is  no  less  than  the  probability 
that  there  are  no  targets  reachable  within  time  t.  In  other  words, 


^  .  Area(i?t(oj))  (vt)3 


Defining  c  =  ( nv3)/(3pA ), 


poo 

E  [tnn]  =  /  Pr(tnn  >  £)  d £ 

>  ja  max  (o.  1  -  n^) 

r-c-V  3 


(1  -  cC3) 


3  /  3pA  \  1/3 

4  v  \  n  J 


If  each  vehicle  visits  its  nearest-neighbor  target,  then  in  expectation,  m  targets  are  visited 
within  time  tnn.  Assuming,  towards  a  lower  bound,  that  targets  are  continually  serviced  at 
a  rate  of  m/tnn,  resulting  in  a  steady-state  number  of  outstanding  targets  nmi, 

m  Avm,  f  nnn  \  ^3 

tnn  3  \ZpA) 


Substituting  Little’s  formula,  i.e.,  nrm  =  ATnn,  and  rearranging,  the  claim  follows  from 

Eopt  >  Tnn.  m 

Note  that  the  system  time  depends  quadratically  on  the  parameter  A,  whereas  in  the 
Euclidean  case  it  depends  on  it  linearly.  As  a  consequence,  bounded-curvature  constraints 
make  the  system  time  more  sensitive  to  increases  in  the  target-generation  rate.  Perhaps 
the  most  striking  consequence  of  the  above  result  is  that  the  lower  bound  suggests  that  the 
system  time  decreases  with  the  cube  of  the  number  of  vehicles,  whereas  in  the  Euclidean 
case  it  decreases  with  the  square  of  the  number  of  vehicles. 


IV. B.  The  Bead  Tiling  (BT)  Policy 

In  Savla  et  a/.,13  the  single- vehicle  Dubins  DTRP  heavy- load  case  is  analyzed.  A  policy 
called  the  Bead  Tiling  Algorithm  (BT)  Policy  is  shown  to  have  a  performance  within  a 
constant  factor  of  the  optimal  for  rectangular  environments.  In  the  following,  the  BT  policy 
is  adapted  for  general  convex  environments,  and  a  method  by  which  the  algorithm  can  be 
used  by  a  multiple-vehicle  system  is  presented.  For  the  sake  of  completeness,  a  modified 
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form  of  the  BT  policy  from13  is  stated  for  general  convex  Q,  and  the  corresponding  upper 
bound  on  the  system  time  for  the  Dubins  DTRP  is  obtained  from  its  analysis. 

The  BT  policy  makes  use  of  the  shape  detailed  in  Fig.  7,  referred  to  as  a  “bead”.13  The 
algorithm  works  by  tiling  the  plane  with  identical  beads.  In  choosing  the  orientation  of  the 
bead  rows,  we  use  the  same  minimum-height  bounding  box  of  Q  with  width  W  and  height 
H,  as  described  in  the  SL  policy.  The  Dubins  vehicle  visits  all  beads  intersecting  Q  in  a  row- 
by-row  fashion  in  top-to-bottom  sequence,  servicing  at  least  one  target  in  every  non-empty 
bead.  This  process  is  repeated  indefinitely.  The  size  of  the  beads  (and  hence  the  total  number 
intersecting  Q)  is  scaled  with  the  target  generation  rate  by  choosing  £  =  mm{GbTAi'/h,  4 p}, 
where 


Cbta  — 


7-V17 


3A  ) 


is  a  design  parameter  chosen  to  maximize  the  performance  of  the  algorithm.  Let  Tbt  be  the 


Figure  7.  The  bead. 


system  time  for  the  single-vehicle  BT  policy.  The  following  theorem  is  a  generalization  of 
the  corresponding  result  from  Savla  et  al.13  for  any  convex  Q. 

Theorem  21  For  any  convex,  compact  Q  C  M2  of  area  A  and  A ,p,v  E  M+,  the  single- 
vehicle  Bead  Tiling  policy  is  a  stabilizing  policy.  Moreover,  if  Q  has  a  bounding  rectangle  of 
width  W  and  height  H ,  the  system  time  for  the  single-vehicle  Bead  Tiling  policy  satisfies 


lim 

A — ^-)-oo 


2”bt 


<  71^ 

V 6 


3 


The  BT  policy  is  extended  to  the  nn- vehicle  Bead  Tiling  (mBT)  policy  in  the  following 
way.  Divide  the  environment  into  regions  of  dominance  with  lines  parallel  to  the  bead  rows. 
Let  the  area  and  height  of  the  i- th  vehicle’s  region  be  denoted  with  Af  and  Hi.  Place  the 
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Figure  8.  An  adaptation  of  the  BT  policy  for  the  multiple-vehicle  case. 


subregion  dividers  in  such  a  way  that 

A  +  l^pHi  =  —  ( A  +  lirpH 
3  m  \  6 

for  all  i  in  Im.  Allocate  one  subregion  to  every  vehicle  and  let  each  vehicle  execute  the  BT 
policy  in  its  own  region.  Let  Tn ,rt  be  the  system  time  for  the  mBT  policy. 


Theorem  22  For  any  convex,  compact  QcR2  of  area  A  and  A ,p,v  E  M+  and  m  eN,  the 
mBT  policy  is  a  stabilizing  policy.  Moreover,  if  Q  has  a  bounding  rectangle  of  width  W  and 
height  H,  the  system  time  for  the  mBT  policy  satisfies 


lim  T, 

— — >H“0o 


—  m 


mBT  “TV  ^ 


A2 


<  71 


pA 


i  +  2^V 

3  A  ) 


Proof:  The  fact  that  the  mBT  policy  is  stabilizing  follows  from  the  stabilizing  property 
of  the  single- vehicle  BT  policy.  Denote  the  target-generation  rate  and  system  time  for  targets 
generated  in  the  Tth  vehicle’s  region  as  A*  and  T, .  respectively.  Also,  let  A,  be  the  area  of 
the  i- th  vehicle’s  region,  in  which  case,  \  =  A  A/ A.  Given  the  definition  of  the  mBT  policy, 
A/m  — >  Too  implies  that  \  — >  +oo  for  all  i  E  Im .  By  Theorem  21,  for  all  i  E  Im, 


-  llpAi  (  7tt pHi\3  2 

r'-  — l1  +  ^4rj  A-  ">x/m 


Too. 


Substituting  A*  =  A  A/ A,  for  all  i  E  Im, 


r  ^  71p  ( A  ,  7t rpHi\  2 
Ti~AAA  V Ai  +  )  A  ’  as  A/m 


Too. 


(23) 


However,  Ai  and  Hi  are  selected  in  the  mBT  policy  so  that  A*  T  BiHk  =  T  (A  T  77TfU )  ■ 
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Substituting  this  into  Eq.  (23),  for  all  i  G  Im, 


T-  < 

il  _  q  q 

m6v 6 


71pA{  7*pH\  asA/m^+0O. 

3  A  J  ' 


Note  that  Tt  is  independent  of  i,  i.e.,  the  system  time  in  each  of  the  vehicle’s  regions  is  the 
same.  The  theorem  then  follows  trivially.  ■ 


Remark  23  Theorem  20  and  Theorem  22  imply  that  the  system  time  for  the  Dubins  DTRP, 
in  the  heavy  load,  is  of  the  order  A 2/(mv)3,  and  that  the  mBT  policy  performs  within  a 
constant  factor  of  the  optimal  policy. 


Remark  24  There  exists  no  stabilizing  policy  for  the  Dubins  DTRP  when  the  targets  are 
generated  in  an  adversarial  worst-case  fashion  with  A/m  >  vj  (7rp).  This  fact  is  a  consequence 
of  the  linear  lower  bound  on  the  length  of  the  Dubins  Traveling  Salesperson  Tour  for  worst- 
case  point  sets  as  derived  in  Savla  et  aid5 


Simulation  results  for  the  m-vehicle  Bead  Tiling  Policy 

Simulations  of  the  mBT  policy  were  run  for  various  values  of  A  and  m.  For  each  combination 
of  conditions,  an  experiment  was  run  for  five-hundred  thousand  simulation  seconds.  The 
number  of  outstanding  targets  was  plotted  with  respect  to  time  to  ensure  that  the  data 
collected  represents  the  performance  of  the  policy  once  the  system  has  reached  steady  state. 
The  system  times  shown  in  Fig.  9  are  the  averages  of  the  waiting  times  of  all  targets 
generated  and  serviced  during  this  period.  Moreover,  the  results  are  verified  for  consistency 
by  substituting  the  known  target-generation  rate  and  the  measured  steady-state  number 
of  outstanding  targets  into  Eq.  (1)  to  obtain  the  system  time  via  an  alternative  method. 
Performance  results  from  simulations  of  the  mBT  policy,  summarized  in  Fig.  9,  validate 
theoretical  predictions  that  the  system  time  grows  quadratically  with  the  target  generation 
rate,  and  decreases  cubically  with  the  number  of  vehicles.  For  example,  for  rn  =  1  and  A 
increasing  from  1/12  s-1  to  1/4  s-1  (Fig.  9,  left),  note  a  slope  of  1.95  on  the  log-log  plot, 
implying  a  power  law  close  to  2.  For  A  =  1/4  s-1  and  m  increasing  from  1  to  4  (Fig.  9, 
right),  note  a  slope  of  —2.91  on  the  log-log  plot,  implying  a  power  law  close  to  —3. 


V.  Phase  Transition 

In  Section  III,  two  policies  were  proposed:  the  Median  Circling  policy  and  the  Strip 
Loitering  policy.  These  policies  exhibit  different  modes  of  behavior.  It  was  shown  that  the 
territorial  MC  policy  is  optimal  as  dp  — *  0+  and  the  gregarious  SL  policy  is  constant-factor 
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NUMBER  OF  VEHICLES 


Figure  9.  Simulations  of  the  mBT  policy  were  run  with  a  square  environment  of  area  A  =  3.6  X  107  m2,  vehicle 
speed  v  =  50  ms-1,  minimum-turning  radius  p  =  600  m,  number  of  vehicles  m  varying  from  1  to  4,  and  target- 
generation  rate  A  varying  from  1/12  s-1  to  1/4  s-1  in  increments  of  1/24  s-1.  Plots  of  the  system  time  versus 
target-generation  rate  (left),  and  system  time  versus  the  number  of  vehicles  (right)  are  given. 


optimal  asm->  +oo,  for  constant  p  and  Q.  This  suggests  the  existence  of  a  phase  transition 
in  the  optimal  policy  for  the  light  load  scenario  as  one  increases  the  number  of  vehicles  for 
a  fixed  p  and  Q. 
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Figure  10.  Simulation  results  aimed  at  demarcating  the  phase  transition  for  the  case  when  the  environment 
is  a  square  of  area  A  =  4x  108  m2.  The  color  of  the  data  point  signifies  which  policy  had  a  lower  system  time 
for  the  corresponding  values  of  the  parameters. 


Simulations  of  both  the  MC  and  SL  policy  were  run  with  a  square  environment  of  area 
A  =  4  x  108  m2  for  varying  values  of  rri  and  p.  For  each  combination  of  conditions  and  for 
each  policy,  the  system  time  was  taken  as  the  average  of  the  waiting  times  of  one-hundred 
thousand  target  points  uniformly  generated  in  the  environment.  For  the  MC  policy,  C2  =  2.91 
is  chosen  since  numerical  results  show  that  it  gives  better  performance  in  terms  of  ci,  as 
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reported  in  Remark  2.  The  number  of  vehicles  m  is  varied  between  1  and  100  and  the 
turning  radius  p  is  varied  such  that  the  ratio  A/p2  takes  values  between  1  and  about  2000. 
The  results  are  summarized  in  Fig.  10  showing  which  policy  performs  better  for  a  given 
value  of  m  and  p.  The  color  of  the  data  point  signifies  which  policy  had  a  lower  system 
time  for  the  given  conditions.  The  plot  illustrates  that  the  transition  curve  between  the 
two  policies  could  be  captured  by  a  straight  line  of  the  form  m  =  kA/p2  for  some  constant 
k  >  0,  thereby  highlighting  the  significance  of  the  nonholonomic  density,  dp  =  p2m/A  in 
capturing  the  transition.  Moreover,  the  slope  of  the  approximating  line  gives  an  estimate  of 
this  critical  value  of  the  nonholonomic  density  to  be  d*q-sim  ~  0.0925. 


It  is  desirable  to  study  the  fundamental  factors  driving  the  transition,  ignoring  its  depen¬ 
dence  on  the  shape  of  Q.  Towards  this  end,  envision  an  infinite  number  of  vehicles  operating 
on  the  unbounded  plane,  where  the  system  is  characterized  by  the  vehicle  density.  Denote 
the  inverse  of  the  density  with  a,  the  area  per  vehicle.  Depictions  of  the  MC  and  SL  policies 
operating  on  an  unbounded  domain  are  shown  in  Fig.  11.  In  this  case,  the  configuration 
p*  yielding  the  minimum  of  the  function  'H(p)  is  that  in  which  the  Voronoi  partition  in¬ 
duced  by  p*  is  a  network  of  regular  hexagons,49  each  of  area  a.  Each  generator  pt  is  the 
median  of  its  own  Voronoi  region,  and  it  is  known  that  if  Q  is  a  regular  hexagon  of  area  a 
then  Hl(Q)  ~  0.377 \/a.  The  system  time  of  the  policy  satisfies  Tmc  <  0.377-ya  +  3.76p. 
This  expression  is  obtained  by  using  the  value  of  C\  from  Remark  2  in  Theorem  13.  In 
this  scenario,  the  SL  policy  reduces  to  vehicles  moving  straight  on  infinite  strips,  where  the 
design  criteria  are  the  width  of  the  strips  w  <  2p,  and  the  distance  between  consecutive 
vehicles  in  the  same  strip  L.  The  system  time  of  the  policy  satisfies  T sl  <  L/ 2  +  3^/pw/A. 
The  area  per  vehicle  is  related  to  the  design  parameters  by  Lw  =  a.  Substituting  gives 
T sl  <  a/2w  +  3^pw/4.  Minimizing  with  respect  to  w,  with  the  constraint  that  w  <  2 p, 
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gives  w  =  min{(4a/3y/p)2/3,  2p},  yielding 


T sl  < 


1.238(pa)1/3  for  dp  >  0.471, 
a/4p  +  1.06p  otherwise  , 


(24) 


where  we  have  substituted  the  nonholonomic  vehicle  density  with  dp  =  p2/a.  Equating  the 
upper  bound  on  the  MC  policy  with  the  upper  bound  on  the  SL  policy  for  dp  <  0.471,  we 
get  0.377y/a  +  3.76p  =  a/4p  +  1.06p.  Dividing  both  sides  by  p  and  substituting  x  =  y/a/p, 
the  quadratic  formula  gives  x  ~  4.127  and  hence  the  critical  nonholonomic  vehicle  density 
is  given  by  d”rilxl~theory  =  1  / x2  m  0.0587.  This  is  a  lower  critical  density  than  implied  by  the 
simulation  results  on  a  square  environment,  favoring  the  SL  policy.  It  would  seem  that  the 
relaxation  of  the  environment  boundary  has  a  greater  impact  on  the  performance  of  the  SL 
policy,  no  longer  requiring  U-turns. 
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Figure  12.  Simulation  results  aimed  at  demarcating  the  phase  transition  for  the  case  when  the  environment 
is  an  unbounded  domain.  The  color  of  the  data  point  signifies  which  policy  had  a  lower  system  time  for  the 
corresponding  values  of  the  parameters. 


To  further  verify  the  above  result,  extensive  simulations  of  the  two  policies  operating  in 
an  unbounded  domain  were  run  for  various  values  of  a  and  p2,  and  Fig.  12  shows  which 
policy  has  the  lower  system  time  for  the  given  set  of  conditions.  For  the  MC  policy,  this 
reduces  to  simulating  a  single  vehicle  operating  on  a  regular  hexagon.  In  the  simulations 
of  the  MC  policy,  trials  were  run  varying  the  loitering  radius  (by  varying  c2)  for  a  given 
set  of  a  and  p2,  and  the  MC  policy’s  system  time  was  taken  as  the  lowest  value  obtained. 
Avoiding  excessive  detail,  it  should  be  noted  that  the  optimal  c2  is  also  a  function  of  dp 
and  that  for  all  conditions  under  which  the  MC  policy  is  dominant,  the  optimal  c2  is  one. 
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The  following  is  an  intuitive  explanation  for  this  fact:  the  MC  policy  is  dominant  when  the 
Euclidean  cost  is  dominant  and  a  choice  of  one  for  c 2  is  the  closest  the  Dubins  vehicle  can 
come  to  replicating  the  optimal  policy  of  the  Euclidean  vehicle  (waiting  at  the  median). 
For  the  SL  policy,  the  unbounded  domain  scenario  reduces  to  simulating  a  single  vehicle 
responsible  for  targets  appearing  in  the  region  of  the  strip  in  front  of  it,  as  dictated  by 
the  chosen  w  and  L  discussed  above.  Trials  were  run  varying  w  (and  hence  L  =  a/ w)  as  a 
fraction  of  2 p.  Experiments  suggest  that  the  optimal  w  is  2 p  for  all  conditions  near  the  phase 
transition.  This  agrees  with  the  theoretical  analysis  above  regarding  the  unbounded  domain 
case.  Also,  once  a  target  appears,  the  vehicle  is  allowed  to  take  a  minimum-time  path  rather 
than  the  service  method  given  in  the  formal  description  of  the  SL  policy.  This  practice 
lends  itself  to  a  fairer  comparison  between  the  two  policies.  For  each  set  of  conditions,  for 
each  policy,  and  for  each  trial  varying  design  parameters  (02  for  the  MC  policy  and  w,  L 
for  the  SL  policy),  the  system  time  was  taken  as  the  average  of  the  waiting  times  of  one- 
hundred  thousand  target  points  uniformly  generated  in  the  vehicle’s  region  of  responsibility. 
Figure  12  provides  a  stronger  justification  for  approximating  the  phase  transition  with  a 
line,  and  hence  the  nonholonomic  density.  The  slope  of  the  transition  line  in  Fig.  12  implies 
a  critical  nonholonomic  density  of  ct™nbd~sim  ~  0.0759.  This  is  within  a  factor  of  1.3  from 
(jujiJj<i— theory .  fulyjlcr  favoring  the  MC  policy.  This  discrepancy  is  ascribed  to  the  conservative 
use  of  a  strict  upper  bound  on  Lp(g,  q )  in  Theorem  13  rather  than  taking  its  expected  value 
over  a  given  domain,  as  is  effectively  done  through  simulation.  To  gain  further  intuition  on 
the  nature  of  this  critical  condition,  consider  the  area  per  vehicle  acnt  =  p2/dcJlt,  which  yields 
aunbd— theory  ^  5.42? j-p2  and  aunbd-sim  ^  4.197 rp2.  In  other  words,  the  transition  occurs  when 
each  vehicle  is  responsible  for  a  region  of  area  5.42  or  4.19  times  that  of  a  minimum  turning- 
radius  disk.  Note  that  the  value  of  d^nbd~sim  is  within  a  factor  of  0.821  from  d*q-sim.  The 
relative  closeness  between  the  values  of  dp  obtained  via  different  methods  gives  a  compelling 
guideline  in  the  form  of  dp  and  its  critical  value  that  can  be  used  by  a  system  architect  to 
decide  upon  the  optimal  strategy  in  the  light  load  scenario  for  given  problem  parameters. 

VI.  Conclusions 

In  this  paper,  several  routing  problems  in  stochastic  time-varying  environments  were 
introduced,  for  vehicles  that  move  with  constant  forward  speed  along  paths  of  bounded 
curvature  in  the  plane.  For  each  of  them  we  derived  fundamental  limits  on  the  achievable 
performance  in  terms  of  system  time,  as  a  function  of  problem  parameters  such  as  the  size 
of  the  fleet,  the  minimum  turning-radius,  the  target-generation  rate,  the  vehicle  speed  and 
the  size  of  the  environment.  In  addition,  for  each  problem,  we  presented  routing  algorithms 
that  provably  approximate  the  optimal  performance. 
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Specifically,  it  has  been  shown  that,  in  light  load,  the  optimal  system  time  is  proportional 
to  the  inverse  cube  root  of  the  number  of  vehicles,  contrasting  with  the  holonomic  case  where 
it  is  proportional  to  the  inverse  square  root.  This  implies  that  the  nonholonomic  motion 
constraints  inhibit  the  added  performance  gained  per  vehicle.  In  low  density  scenarios, 
the  optimal  policy  is  one  of  territorial  behavior,  but  as  the  density  increases,  a  gregarious 
behavior  is  required  to  reap  the  benefits  of  a  large  team.  In  heavy  load,  it  was  shown  that  the 
system  time  is  of  order  proportional  to  the  square  of  the  target  appearance  rate,  and  inversely 
proportional  to  the  cube  of  the  number  of  vehicles,  whereas  it  is  known  to  be  proportional  to 
the  target  appearance  rate  and  inversely  proportional  to  the  square  of  the  number  of  vehicles 
in  the  holonomic  case.  This  implies  that  the  system  time  in  the  nonholonomic  case  is  more 
sensitive  to  the  target  generation  rate  as  well  as  the  size  of  the  vehicle  fleet  and  vehicle  speed 
than  the  holonomic  case.  A  phase  transition  was  recognized  in  the  optimal  algorithm  and 
numerical  experiments  were  performed  towards  demarcating  the  parameter  values  at  which 
this  phase  transitions  occur.  This  led  to  a  set  of  guidelines  available  to  system  architects  and 
control  algorithm  designers  aiming  to  maximize  the  performance  of  their  autonomous- vehicle 
systems.  The  routing  algorithm  can  be  chosen  according  to  the  size  of  the  vehicle  team.  On 
other  hand,  depending  on  the  conditions  of  the  operating  environment,  a  designer  might  use 
the  analysis  provided  to  decide  if  the  additional  cost  of  another  vehicle  is  worth  the  added 
value  in  performance  it  will  bring. 

In  the  future,  various  extensions  of  the  problem  considered  in  this  paper  will  be  studied. 
For  example,  nonholonomic  dynamics  will  be  combined  with  constraints  on  the  information 
available  to  the  vehicles  by  limiting  communication  and  sensing  capabilities.  Another  im¬ 
portant  extension  is  the  inclusion  of  obstacles  or  no-fly  zones  restricting  the  motion  of  the 
UAVs.  The  focus  of  this  paper  was  on  the  analysis  of  fundamental  performance  bounds,  and 
on  the  design  of  algorithms  achieving  provable  approximations  of  the  optimal  performance. 
Avenues  for  future  work  include  further  work  on  efficient  implementations  of  the  proposed 
algorithms,  including  for  example  the  design  of  decentralized  strategies  for  the  individual  ve¬ 
hicles  which  give  rise  to  a  group  behavior  that  is  near  optimal  for  the  system  as  a  whole.  A 
game-theoretic  approach  might  lend  itself  naturally  to  such  a  completely  distributed  control 
paradigm.  The  understanding  of  phase  transitions  gained  from  this  work  can  aid  in  studying 
similar  phenomena  in  biological  systems.  Finally,  experimental  validation  of  the  proposed 
algorithms  on  fixed-wing  UAV  platform  is  currently  being  pursued  using  commercial  off-the- 
shelf  autopilots. 


34  of  42 


Routing  Problems  for  Multiple  UAVs,  Enright  et  al. 


A.  The  Dubins  distance  function 


In  this  section,  a  brief  exposition  about  the  Dubins  distance  function  Lf)(g.  q )  is  given. 
Since  the  Dubins  distance  is  invariant  under  transformation  of  coordinates  from  the  global 
frame  to  the  Dubins  frame,  without  loss  of  generality,  attention  can  be  focusd  on  Lp(g0 ,  q), 
where  g0  =  (0,  0,  0)  and  q  is  the  coordinate  of  the  destination  point  in  the  Dubins  frame. 
Define  D+  :=  {q  G  R2  :  ||g-(0,p)||  <  p},  V~  :=  {qeR2  :  ||g-(0, -p)||  <  p}.  If  q  G  V+UV~ 
the  optimal  path  is  of  type  CC,  otherwise  it  is  of  type  CL.  The  following  proposition  gives 
a  closed  form  expression  for  Lp(g0,  q). 


Proposition  25  The  minimum  length  Lp(g0,  q)  of  a  Dubins  path  steering  a  vehicle  from 
go  =  (0, 0,  0)  G  SE(2)  to  a  point  q  =  (x,  y )  G  M2  is  given  by: 


LP(g0,  q)  =  { 


~  P2  +  P  (  9c(q)  ~  arccos  j- J  , 

-  .  dc(q)sm0c(q )  .  psina(g) 

p  I  2n  —  a{q)  +  arcsm - - b  arcsm 


Mq) 


df(q ) 


for  q<£V+  UVp- 
otherwise, 


where  dc(q)  =  \fx*  +  (jyj  —  p)2  and  0c(q)  =  atan2(x,p  —  |y|)  are  polar  coordinates  of  the 
point  q  with  respect  to  the  center  of  either  Vp  or  V ']t,  whichever  is  the  closest,  df(q)  = 
\fx?  +  (jy|  +  p)2  is  the  distance  of  q  from  the  other  center,  and 


a(q) 


=  arccos 


5p2  -  df  (q)2 
4p2 


Proof:  The  proof  is  based  on  the  results  in  Thomaschewski46  mentioned  in  the  paper, 
and  elementary  planar  geometry.  ■ 

The  facts  stated  in  the  following  remarks  are  consequences  of  Proposition  25. 


Remark  26  The  level  sets  of  the  function  Lp(g0,q )  are  segments  of  well-studied  curves. 
More  precisely,  level  sets  of  Lp(g0,q)  are: 

•  Arcs  of  circle  involutes,  for  q  ^  V+  U  V~ ; 

•  Arcs  of  epicycloids,  for  q  G  Vp  U  T>~ . 

Recall  that  a  circle  involute  is  the  curve  traced  by  a  point  fixed  to  a  line,  as  the  line  rolls 
without  slipping  on  a  circle;  an  epicycloid  is  the  curve  traced  by  a  point  fixed  to  a  circle, 
as  the  circle  rolls  without  slipping  on  another  circle.  For  further  details  on  these  families  of 
curves  see,  e.g.,  Lawrence.57  A  depiction  of  such  level  sets  is  provided  in  Fig.  13. 


Remark  27  The  function  Lp(g0lq)  is: 
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y 


Figure  13.  The  level  sets  of  the  function  Lp(go,q),  for  p  =  1,  shown  at  increments  of  7r/4.  The  initial  vehicle 
configuration  is  represented  by  the  black  triangle,  and  is  at  coordinates  (0,  0)  heading  up.  Note  that  the  positive 
direction  of  the  x  axis  is  up  and  of  the  y  axis  is  left. 


•  not  continuous  on  the  top  half  of  the  boundary  of  V+  and  T>  ,  i.e.,  for  {(x,y)  G 
V+  U  V-  :  X  >  0}; 

•  continuous,  but  not  differentiable  on  the  bottom  half  of  the  boundary  of  the  same  set, 
i.e.,  for  {( x,y )  €  V+  U  T>~  :  x  <  0},  and  on  the  negative  part  of  the  x-axis,  i.e.,  for 
{(x,y)  :  x  <  0,y  =  0}; 

•  continuous  and  differentiable  everywhere  else. 

Remark  28  The  length  of  the  optimal  path  to  reach  a  point  inside  one  of  the  circles  Vp  is 
at  least  up;  furthermore,  one  can  verify  that 

sup  Lp(g0,  q)  =  p\2ir  + 2  arctan  —  arccos  -  )  ~  6.5906p, 
gevf  V  9  8  J 

and  that  such  a  supremum  is  attained  in  the  limit  as  q  approaches,  from  within  V+ ,  the  point 

qsuP  =  ^  e  dV+P  ■ 

Because  of  symmetry,  a  similar  result  holds  within  T>~ . 

As  is  apparent  from  the  preceding  discussion,  the  function  Lp  is  fairly  complicated.  The 
next  result  establishes  lower  and  upper  bounds  on  Lp(g0,  q)  based  on  the  Euclidean  norm  of 
q. 

Proposition  29  For  any  q  =  (x,y)  G  M2,  the  following  holds: 

IHI  +  p(0(q)  ~  sin  0(<l))  <  Lp(9o ,  q)  <  Ikll  +  2 TTp,  (25) 
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where  d(q)  =  atan2(|y|,rc). 


Proof:  The  lower  bound  is  proved  first.  If  q  G  Vp  U  V~,  then  ||g||  <  2psin  0(q),  and 
hence 

Ikll  +  p(Q(q)  ~  sin  6(q))  <  p(6(q )  +  sin 6{q)  <  up. 

However  Lp(g0,  q )  >  up  within  T>p  U  T>~,  which  proves  that  the  bound  holds  in  this  case. 

If  q  f  Vp  U  T>~ ,  proceed  in  the  following  way.  Define  the  point  q'  as  depicted  in  Fig.  14, 
i.e.,  by  translating  the  segment  from  the  origin  to  q  normally  to  itself,  until  it  is  tangent  to 
the  minimum-radius  circle  dVp.  Since  Lp(g0,q')  =  pd(q)  +  ||g||  —  psin9(q),  the  bound  holds 
in  this  case  as  well. 


Figure  14.  Constructions  for  the  lower  bound  (left)  and  upper  bound  (right)  on  Lp. 


The  proof  for  the  upper  bound  is  constructive:  consider  a  path  of  type  CLC  in  which  the 
first  subpath  has  arc  length  p9(q),  the  linear  subpath  has  length  ||g||,  and  the  final  subpath 
has  arc  length  p(2n  —  9(q)).  Such  a  path  terminates  at  the  desired  point  q,  and  has  the 
required  length.  ■ 


B.  Technical  lemmas  for  Theorem  10 

In  this  section,  the  technical  lemmas  used  in  the  proof  of  Theorem  10  are  presented. 

Lemma  30  Consider  Hi,H2,A3  G  M+  such  that  A3  >  A2  >  A\  >  0,  then  the  average 
Dubins  distance  satisfies  the  following  monotonic  relationship: 

(Jq  <  fRA3(9i)\RA1(9i)Lp(9i,q}  ^ 

A2  —  A\  A3  —  A\ 
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Proof:  Let  L  :=  - .  One  can  then  write 

IraA^ARaA n)Lp(9hQ)  d(l  (ffO\-^A2 (ffO  dq  +  f RA2(g.)\RAi(g.)  LpjgiPd  dq 

A3  —  A\  A3  —  A\ 

>  fRAsigiPRA^^)  Lp(9ii  $  dq  +  L{M  ~ 

A3  —  A\ 


For  any  q  G  RA3(gi )  \  Ra2{9% ),  Lp(guq )  >  L.  Therefore, 

/flA3(gi)\flAl(gi)  -M&’  g)  d(l 
As  —  A 1 


>  L. 


Lemma  31  The  function  defined  in  Eq.  (8)  is  convex. 

Proof:  Consider  Ai,  A3  G  R+  such  that  A3  >  A\  and  a  G  (0, 1).  Define  A2  :=  «Ai  +  (l  — 
a)A3.  Convexity  is  shown  by  proving  that  for  any  a  G  (0, 1),  f(A2)  <  af(Ai)  +  (1  —  a)f(A3) 
(the  proof  for  the  case  when  a  =  0  or  a  =  1  is  trivial).  One  can  write 


f(Ai) 
f(A2) 
f  (^3) 


,RA1(gi) 


' RA1(gi ) 


Lp(gi,q )  dq, 
Lp{gi,q )  dq- 


'  Ra1  ( 9i ) 


Lp{gi,q )  dq  + 


'  RA2(9i)\RA1(9i) 

[ 

1  RA3(9i)\RA1(9i) 


LP(g»q)  dq, 
Lp(gi,q)  dq. 


Now  use  Lemma  30  to  derive  the  relation 


/  Lp{gh  q)  dq<(l-a)  /  Lpfa,  q)  dq, 

' Ra2  (gi)\RA1  ( 9i )  J  Ra3  (9i)\Ra1  ( 9i ) 
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and,  in  turn, 


f(A2)=  Lp(gi,q)dq+  Lp(gi:q)  dq 

JRa1  ( 9i )  JRa2  (9i)\RA1  ( 9i ) 

<  /  Lp(gi,q)  dq+  (1  -  a)  Lp(gi,q)  dq 

J RAl  ( 9i )  J Ra3  ( 9i)\RA1  (,9i) 


=  a  Lp(gi,q)  dq  +  (1  —  a)  I  /  Lp(ghq)dq+  I  Lp{ghq)  dq 

J  R  A,  (Qi)  \  JRAl(9i)  J  RA3{9i)\RA1(9i) 


,Ra1(9z) 

=  ®f(Ai)  +  (1  —  a)f(As). 


Therefore,  /  is  convex. 
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